Non-Topical Classification of Query Logs Using Background Knowledge

Authors

  • Isak Taksa
  • Amanda Spink
Abstract

Background knowledge has been actively investigated as a possible means of improving the performance of machine learning algorithms. Research has shown that background knowledge plays an especially critical role in three atypical text categorization tasks: short-text classification, limited labeled data, and non-topical classification. This chapter explores the use of machine learning for non-hierarchical classification of search queries, and presents an approach to background knowledge discovery that uses information retrieval techniques. Two different sets of background knowledge obtained from the World Wide Web, one in 2006 and one in 2009, are used with the proposed approach to classify a commercial corpus of web query data by the age of the user. In the process, various classification scenarios are generated and executed, providing insight into the choice, significance and range of tuning parameters, and exploring the impact of the dynamic web on classification results.


Similar resources

Hierarchical Taxonomy Extraction by Mining Topical Query Sessions

Search engine logs store detailed information on Web users' interactions. Thus, as more and more people use search engines on a daily basis, important trails of users' common knowledge are being recorded in those files. Previous research has shown that it is possible to extract concept taxonomies from full text documents, while other scholars have proposed methods to obtain similar queries from q...


QEA: A New Systematic and Comprehensive Classification of Query Expansion Approaches

A major problem in information retrieval is the difficulty of defining the user's information needs; on the other hand, once the user submits a query, there is a vast amount of information to retrieve. Different methods have therefore been suggested for query expansion, which is concerned with reconfiguring the query by increasing efficiency and improving the criterion accuracy in the information...


Toward a Short Text Classification Framework Based on Background Knowledge Discovery

The ubiquitous, diverse and growing impact of digital living creates a massive amount of short text: a search query, a tweet or a caption. Short text frequently presents itself as an arbitrary combination of semantically unconnected words. Using machine learning to classify the corpora of such texts is a challenging task. A large body of research exists in this area, but in this paper we will foc...


Analysis of varying approaches to topical web query classification

Topical classification of web queries has drawn recent interest from forums such as the 2005 KDD Cup because of the promise it offers in improving retrieval effectiveness and efficiency. Many proposed techniques make use of documents classified in taxonomies (such as the ODP: Open Directory Project – http://www.dmoz.org) to inform on the class of a web query. Implicit in these approaches is the...


Acquiring knowledge about human goals from Search Query Logs

A better understanding of what motivates humans to perform certain actions is relevant for a range of research challenges including generating action sequences that implement goals (planning). A first step in this direction is the task of acquiring knowledge about human goals. In this work, we investigate whether Search Query Logs are a viable source for extracting expressions of human goals. F...




Publication date: 2016